
  An Improved Data Linkage Technique Based on Clustering Tree for Decision Making  
  Authors : Sathya. T; Nithya. K

 

The current state of the art in data linkage matches entities from different data sources that do not share a common identifier. This paper considers one-to-many data linkage, using a clustering tree to drive the decision process. Prior work has largely not addressed one-to-many linkage tasks; it has instead focused on linking entities of the same type. In this paper, two new splitting criteria are introduced to improve linkage performance by selecting the best split at each node during decision tree construction, and the linked data is secured against unauthorized use. Pruning techniques are applied to remove anomalies from the clustering tree.
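The abstract does not name the two splitting criteria, so the following is only a minimal sketch of how a split could be chosen at one node of a clustering tree. It assumes a Jaccard-based criterion: the attribute whose partitions are linked to the least similar sets of records in the other table is preferred. The function names (`jaccard`, `split_score`, `best_split`) and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def split_score(pairs, attribute):
    """Average pairwise Jaccard similarity between the sets of linked
    B-side values obtained by partitioning the A-side records on
    `attribute`. A lower score means the partitions are better separated."""
    groups = defaultdict(set)
    for a_record, b_value in pairs:
        groups[a_record[attribute]].add(b_value)
    subsets = list(groups.values())
    if len(subsets) < 2:
        return 1.0  # the attribute does not actually split the data
    total, count = 0.0, 0
    for i in range(len(subsets)):
        for j in range(i + 1, len(subsets)):
            total += jaccard(subsets[i], subsets[j])
            count += 1
    return total / count


def best_split(pairs, attributes):
    """Pick the attribute with the lowest (assumed) Jaccard-based score."""
    return min(attributes, key=lambda attr: split_score(pairs, attr))


# Hypothetical training data: (record from table A, linked value in table B).
pairs = [
    ({"city": "Berlin", "type": "private"}, "books"),
    ({"city": "Berlin", "type": "business"}, "books"),
    ({"city": "Bonn", "type": "private"}, "tools"),
    ({"city": "Bonn", "type": "business"}, "tools"),
    ({"city": "Berlin", "type": "private"}, "music"),
    ({"city": "Bonn", "type": "business"}, "garden"),
]

chosen = best_split(pairs, ["city", "type"])
```

In this toy data, splitting on `city` separates the linked values completely, while splitting on `type` leaves the two partitions sharing half of their values, so `city` is chosen. A full tree builder would apply the same selection recursively and then prune.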

 

Published In : IJCSN Journal Volume 3, Issue 6

Date of Publication : December 2014

Pages : 478 - 482

Figures : 02

Tables : --

Publication Link : An Improved Data Linkage Technique Based on Clustering Tree for Decision Making

 

 

 

Sathya. T : ME Student, Department of Computer Science and Engineering, KSR College of Engineering, Anna University, Namakkal, Tamilnadu 637215, India

Nithya. K : Assistant Professor, Department of Computer Science and Engineering, KSR College of Engineering, Anna University, Namakkal, Tamilnadu 637215, India

 

 

 

 

 

 

 

Keywords : Data linkage, clustering tree, splitting criteria, pruning

In this paper, we propose a novel method, based on a clustering tree, for linking data that does not share a common identifier. With it, clusters of data from two different datasets can be matched, which is the main challenge in data linkage. We apply the one-to-many data linkage technique with a one-class clustering tree, particularly in the database misuse domain. The linkage is carried out with a decision tree technique: each node is treated as a cluster, and the data in the different datasets is matched as the result. The method improves the efficiency of the data linkage process.
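As an illustration of how a clustering tree can yield one-to-many links, the sketch below walks a record from one table down a small hand-built tree and returns every record of the second table that falls in the leaf's cluster. The `Node` structure, the `find_leaf` and `link` helpers, and the sample tables are assumptions made for this example; the paper's actual tree representation and matching rule may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    # Internal node: `attribute` names the A-side field tested at this node.
    # Leaf node: `attribute` is None and `cluster` holds the B-side values
    # that records reaching this leaf are expected to link to.
    attribute: Optional[str] = None
    children: Dict[str, "Node"] = field(default_factory=dict)
    cluster: Set[str] = field(default_factory=set)


def find_leaf(node: Node, a_record: dict) -> Optional[Node]:
    """Follow the A-side record down the tree; None if a branch is missing."""
    while node.attribute is not None:
        child = node.children.get(a_record.get(node.attribute))
        if child is None:
            return None
        node = child
    return node


def link(tree: Node, a_record: dict, table_b: List[dict], b_field: str) -> List[dict]:
    """Return all B-side records whose `b_field` value is in the leaf cluster.
    One A-side record may therefore link to many B-side records."""
    leaf = find_leaf(tree, a_record)
    if leaf is None:
        return []
    return [b for b in table_b if b[b_field] in leaf.cluster]


# Hypothetical tree with a single split on "city".
tree = Node(
    attribute="city",
    children={
        "Berlin": Node(cluster={"books", "music"}),
        "Bonn": Node(cluster={"tools", "garden"}),
    },
)

table_b = [
    {"id": 1, "category": "books"},
    {"id": 2, "category": "tools"},
    {"id": 3, "category": "music"},
]

matches = link(tree, {"city": "Berlin"}, table_b, "category")
matched_ids = [b["id"] for b in matches]
```

Here the Berlin record links to records 1 and 3 of the second table. In a misuse-detection setting, an accessed record that falls outside the leaf cluster could be flagged for review, though the scoring rule for that is not specified in the text above.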

 

 

 

 

 

 

 

 

 
